Make sure you have Python installed, with the following packages:
- pandas
- networkx
- matplotlib
- hiveplot
- IPython HTML Notebook 3.0

I recommend the Anaconda distribution of Python for its ease of installation. You can download it here: https://store.continuum.io/cshop/anaconda/
You may wish to create a new environment for the tutorial. Using Anaconda, I recommend running:
conda create -n net_tutorial scipy numpy pandas networkx matplotlib ipython [notebook]
On Linux and Mac, switch to that environment with:
source activate net_tutorial
Windows users should do:
activate net_tutorial
Ensure that the packages listed above are installed in this environment.
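One quick way to confirm the environment is ready is to ask Python which of the packages it can find. The snippet below is a minimal sketch, not part of the original tutorial material: the helper name `find_missing` is hypothetical, and it assumes each package's import name matches the name in the list above (with `IPython` capitalized as its module is).

```python
import importlib.util


def find_missing(packages):
    """Return the names in `packages` that cannot be located for import."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


# Assumed import names for the packages listed above.
tutorial_packages = ["pandas", "networkx", "matplotlib", "hiveplot", "IPython"]

missing = find_missing(tutorial_packages)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All tutorial packages were found.")
```

Run this inside the activated `net_tutorial` environment; anything reported as missing can then be installed before the tutorial starts.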
You will need to pair up!
These are the pairing criteria:
[s for s in my_fav_things if s['name'] == 'raindrops on roses']
s and my_fav_things.
Following this, behind your PyCon 2015 mini-business cards:
All your relational problems are belong to networks. :-)
Networks, a.k.a. graphs, are an immensely useful tool for modelling complex relational problems.
Networks comprise two main entities, nodes and edges:
Edges denote relationships between the nodes.
In a network, if two nodes are joined together by an edge, then they are neighbors of one another.
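To make the idea of neighbors concrete, here is a small sketch using networkx, one of the packages listed for this tutorial. The three names are made up for illustration and are not part of the tutorial's data sets.

```python
import networkx as nx

# A tiny undirected network with hypothetical people as nodes.
G = nx.Graph()
G.add_edge("Alice", "Bob")   # Alice and Bob are joined by an edge
G.add_edge("Bob", "Carol")   # Bob and Carol are joined by an edge

# Neighbors are the nodes that share an edge with the given node.
bob_neighbors = sorted(G.neighbors("Bob"))
alice_neighbors = sorted(G.neighbors("Alice"))

print(bob_neighbors)
print(alice_neighbors)
```

Bob is joined to both Alice and Carol, so both are his neighbors. Alice and Carol share no edge, so they are not neighbors of one another.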
There are generally two types of networks - directed and undirected. In undirected networks, edges do not have a directionality associated with them. In directed networks, they do.
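The difference between the two types can be seen in a short networkx sketch. The node labels "A" and "B" are placeholders chosen for illustration.

```python
import networkx as nx

# Undirected: the edge A-B has no direction.
U = nx.Graph()
U.add_edge("A", "B")

# Directed: the edge points from A to B only.
D = nx.DiGraph()
D.add_edge("A", "B")

undirected_both_ways = U.has_edge("A", "B") and U.has_edge("B", "A")
directed_forward = D.has_edge("A", "B")
directed_backward = D.has_edge("B", "A")

print(undirected_both_ways, directed_forward, directed_backward)
```

In the undirected graph the edge can be queried in either order, while in the directed graph only the A-to-B direction exists.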
Can you think of any others?
The key questions here are as follows. How do we...:
It is my hope that when you leave this tutorial, practically, you will be equipped to:
From a broader perspective, I hope you will be able to:
Much of this work is inspired by Prof. Allen Downey (Olin College of Engineering) and Prof. Jukka-Pekka Onnela (Harvard School of Public Health).
Hive plots and Circos plots were originally invented by Martin Krzywinski of the BC Genome Sciences Center.
Circos plots were implemented with help from Justin Zabilansky (MIT).
Many thanks to the PyCon Rehearsal class for providing feedback on the material.
In this tutorial, we will go through two data sets.
The first one is a small-scale, synthetic social network between 30 individuals, to illustrate some of the basic concepts when constructing and analyzing networks. I will use this data set for the first half of the tutorial.
The second one is a larger-scale bicycle sharing data set, publicly available on the Divvy website, but also included with this tutorial. You will use this data set during the free hacking time.
We will also be constructing our own name-knowledge network in-class, as well as a city-people bipartite graph.
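For readers unfamiliar with bipartite graphs, the sketch below shows what a city-people graph might look like in networkx. The names, the cities, and the meaning of an edge (a person has some connection to a city) are assumptions made for illustration; the in-class graph may be defined differently.

```python
import networkx as nx

# Hypothetical people and cities, used only for illustration.
people = ["Ana", "Ben"]
cities = ["Montreal", "Boston"]

B = nx.Graph()
# The "bipartite" node attribute records which of the two sets a node belongs to.
B.add_nodes_from(people, bipartite="people")
B.add_nodes_from(cities, bipartite="cities")

# Edges only ever join a person to a city, never two people or two cities.
B.add_edges_from([("Ana", "Montreal"), ("Ana", "Boston"), ("Ben", "Boston")])

is_two_mode = nx.is_bipartite(B)
boston_people = sorted(B.neighbors("Boston"))

print(is_two_mode)
print(boston_people)
```

Because every edge crosses between the two sets, the neighbors of a city are always people, and the neighbors of a person are always cities.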